Spectrum of Variable-Random Trees
Authors
Abstract
In this paper, we show that a continuous spectrum of randomisation exists, and that most existing tree randomisations operate only around the two ends of this spectrum, leaving a large part of it largely unexplored. We propose a base learner, VR-Tree, which generates trees with variable randomness. VR-Trees span from conventional deterministic trees to complete-random trees by means of a probabilistic parameter. Using VR-Trees as the base models, we explore the entire spectrum of randomised ensembles, together with Bagging and Random Subspace. We discover that the two halves of the spectrum have distinct characteristics, and understanding them allows us to propose a new approach to building better decision tree ensembles. We name this approach Coalescence; it coalesces a number of points in the random half of the spectrum. Coalescence acts as a committee of “experts” to cater for unforeseeable conditions presented in training data. Coalescence is found to perform better than any single operating point in the spectrum, without the need to tune to a specific level of randomness. In our empirical study, Coalescence ranks top among the benchmarked ensemble methods, including Random Forests, Random Subspace and C5 Boosting, and it is the only method in the comparison that is significantly better than Bagging and Max-Diverse Ensemble. Although Coalescence is not significantly better than Random Forests, we have identified conditions under which one will perform better than the other.
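The abstract states only that a probabilistic parameter moves a VR-Tree between the deterministic and the complete-random extremes. The following is a minimal sketch of one way to read that idea, not the authors' implementation. It assumes a parameter called `alpha` here: at each node the split is chosen deterministically with probability `alpha` and at random otherwise. The use of information gain as the deterministic criterion, the midpoint thresholds, all function names, and the particular `alpha` values used for the Coalescence-style vote are illustrative assumptions.

```python
import math
import random
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(X, y):
    """Deterministic choice (assumed criterion): highest information gain."""
    best, best_gain = None, 0.0
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            gain = entropy(y) - (len(left) * entropy(left)
                                 + len(right) * entropy(right)) / len(y)
            if gain > best_gain:
                best, best_gain = (f, t), gain
    return best  # None when no split improves purity

def random_split(X, rng):
    """Random choice: random feature, random midpoint between adjacent values."""
    usable = [f for f in range(len(X[0])) if len(set(row[f] for row in X)) > 1]
    if not usable:
        return None
    f = rng.choice(usable)
    values = sorted(set(row[f] for row in X))
    i = rng.randrange(len(values) - 1)
    return f, (values[i] + values[i + 1]) / 2

def build_vr_tree(X, y, alpha, rng):
    """alpha = 1.0 gives a deterministic tree, alpha = 0.0 a complete-random one."""
    majority = Counter(y).most_common(1)[0][0]
    if len(set(y)) == 1:
        return {"label": majority}
    # With probability alpha use the deterministic test, otherwise a random one.
    split = best_split(X, y) if rng.random() < alpha else random_split(X, rng)
    if split is None:
        return {"label": majority}
    f, t = split
    left = [i for i, row in enumerate(X) if row[f] <= t]
    right = [i for i, row in enumerate(X) if row[f] > t]
    if not left or not right:  # guard against degenerate thresholds
        return {"label": majority}
    return {
        "f": f, "t": t,
        "left": build_vr_tree([X[i] for i in left], [y[i] for i in left], alpha, rng),
        "right": build_vr_tree([X[i] for i in right], [y[i] for i in right], alpha, rng),
    }

def predict(tree, row):
    while "label" not in tree:
        tree = tree["left"] if row[tree["f"]] <= tree["t"] else tree["right"]
    return tree["label"]

def build_ensemble(X, y, alphas, rng):
    """One tree per alpha; passing values from the random half mimics Coalescence."""
    return [build_vr_tree(X, y, a, rng) for a in alphas]

def vote(trees, row):
    return Counter(predict(t, row) for t in trees).most_common(1)[0][0]

if __name__ == "__main__":
    rng = random.Random(0)
    X = [[0.0], [1.0], [2.0], [3.0]]
    y = [0, 0, 1, 1]
    trees = build_ensemble(X, y, [0.0, 0.1, 0.2, 0.3, 0.4, 0.5], rng)
    print([vote(trees, row) for row in X])
```

Passing a list of `alpha` values between 0 and 0.5 to `build_ensemble` corresponds to combining several operating points from the random half of the spectrum, which is how the abstract describes Coalescence; the exact points and the voting rule used in the paper are not given in this section.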
Similar resources
The Subtree Size Profile of Bucket Recursive Trees
Kazemi (2014) introduced a new version of bucket recursive trees as another generalization of recursive trees where buckets have variable capacities. In this paper, we get the $p$-th factorial moments of the random variable $S_{n,1}$ which counts the number of subtrees size-1 profile (leaves) and show a phase change of this random variable. These can be obtained by solving a first order partial...
On the first variable Zagreb index
The first variable Zagreb index of graph $G$ is defined as \begin{eqnarray*} M_{1,\lambda}(G)=\sum_{v\in V(G)}d(v)^{2\lambda}, \end{eqnarray*} where $\lambda$ is a real number and $d(v)$ is the degree of vertex $v$. In this paper, some upper and lower bounds for the distribution function and expected value of this index in random increasing trees (rec...
Phase Changes in Subtree Varieties in Random Recursive and Binary Search Trees
We study the variety of subtrees lying on the fringe of recursive trees and binary search trees by analyzing the distributional behavior of $X_{n,k}$, which counts the number of subtrees of size $k$ in a random tree of size $n$, with $k = k(n)$ dependent on $n$. Using analytic methods we can characterize for both tree families the phase change behavior of $X_{n,k}$ as follows. In the subcritical case, ...
A Random Walk with Exponential Travel Times
Consider the random walk among $N$ places with $N(N-1)/2$ transports. We attach an exponential random variable $X_{ij}$ to each transport between places $P_i$ and $P_j$ and take these random variables mutually independent. If transports are possible or impossible independently with probability $p$ and $1-p$, respectively, then we give a lower bound for the distribution function of the smallest path at point log...
On the spectra of reduced distance matrix of the generalized Bethe trees
Let G be a simple connected graph and {v_1,v_2,..., v_k} be the set of pendent (vertices of degree one) vertices of G. The reduced distance matrix of G is a square matrix whose (i,j)-entry is the topological distance between v_i and v_j of G. In this paper, we compute the spectrum of the reduced distance matrix of the generalized Bethe trees.
Journal title:
- J. Artif. Intell. Res.
Volume 32 Issue
Pages -
Publication date 2008